{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import sys\n",
    "IN_COLAB = 'google.colab' in sys.modules\n",
    "if IN_COLAB:\n",
    "    !git clone https://github.com/cs357/demos-cs357.git\n",
    "    !mv demos-cs357/figures/ .\n",
    "    !mv demos-cs357/additional_files/ ."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import scipy.linalg as sla\n",
    "import numpy.linalg as la\n",
    "import random\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Obtaining eigenvalues and eigenvectors numerically"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We want to prepare a matrix with deliberately chosen eigenvalues. Let's use diagonalization to write the matrix $\\mathbf{A}$:\n",
    "\n",
    "$$ \\mathbf{A} = \\mathbf{U D U}^{-1} $$\n",
    "\n",
    "where we set ${\\bf D}$ to be a known matrix with the pre-defined eigenvalues:\n",
    "\n",
    "```python\n",
    "D = np.diag([lambda1, lambda2, ..., lambdan])\n",
    "```\n",
    "\n",
    "We need to generate a matrix $\\mathbf{U}$ that has an inverse. Orthogonal matrices are a great option here, since $\\mathbf{U}^{-1} = \\mathbf{U}^T$. We use QR decomposition to get an orthogonal matrix (you don't need to understand this method)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "n = 4\n",
    "X = np.random.rand(n,n)\n",
    "U,_ = sla.qr(X)\n",
    "\n",
    "D = np.diag([6,2,4,7])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(U)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can use diagonalization to write $\\mathbf{A}$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "A = U@D@U.T"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And we can check that the eigenvalues are indeed what we expected:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "eigl, eigv = la.eig(A)\n",
    "print(eigl)\n",
    "print(eigv)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We want to find the eigenvector corresponding to the largest eigenvalue in magnitude. For that, we can use `np.argsort`, which returns the indices that sort the array in ascending order. Hence, we are interested in the last entry."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "eig_index_sort = np.argsort(abs(eigl))\n",
    "print(eig_index_sort)\n",
    "eigpos = eig_index_sort[-1]\n",
    "print(eigpos)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Recall that eigenvectors are stored as columns! Hence this would be the eigenvector corresponding to the largest (in magnitude) eigenvalue."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "eigv[:,eigpos]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Power Iteration"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's pick an initial vector:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "x0 = np.random.randn(n)\n",
    "x0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And we perform matrix-vector multiplications:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "x = x0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "x = A@x\n",
    "print(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Power iteration should converge to a multiple of the eigenvector ${\\bf u}_1$ corresponding to largest eigenvalue (in magnitude)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ {\\bf x}_k = (\\lambda_1)^k \\left[ \\alpha_1 {\\bf u}_1 + \\alpha_2 \\left(\\frac{\\lambda_2}{\\lambda_1}\\right)^k{\\bf u}_2 + ...  \\right] $$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's implememt power iteration. We simply perform multiple matrix vector multiplications using a for loop:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "x = x0\n",
    "for i in range(40):\n",
    "    x = A @ x\n",
    "    print(x)\n",
    "    \n",
    "print('Exact eigenvalue = ',eigv[:,eigpos])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* What's the problem with this method?\n",
    "* Does anything useful come of this?\n",
    "* How do we fix it?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can get the corresponding eigenvalue"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "np.dot(x,A@x)/np.dot(x,x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Normalized power iteration"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Back to the beginning: Reset to the initial vector and normalize"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "x = x0/la.norm(x0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Implement normalized power iteration. We will start with 10 iterations, and see what happens..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "x = x0/la.norm(x0)\n",
    "\n",
    "for i in range(10):\n",
    "    x = A @ x\n",
    "    nrm = la.norm(x)\n",
    "    x = x/nrm\n",
    "    print(x)\n",
    "\n",
    "print('exact = ' ,eigv[:,eigpos])\n",
    "\n",
    "print('eig_approx = ', np.dot(x,A@x)/np.dot(x,x))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### What if the starting guess does not have any component of ${\\bf u}_1$, i.e., if $\\alpha_1 = 0$? "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ {\\bf x}_k = (\\lambda_1)^k  \\alpha_1 {\\bf u}_1 + (\\lambda_1)^k  \\left(\\frac{\\lambda_2}{\\lambda_1}\\right)^k \\alpha_2 {\\bf u}_2 + (\\lambda_1)^k \\left[   \\left(\\frac{\\lambda_3}{\\lambda_1}\\right)^k \\alpha_3{\\bf u}_3 +  ...  \\right] $$\n",
    "\n",
    "In theory (or infinite precision calculations), if $\\alpha_1=0$, power iteration will converge to a vector that is a multiple of the eigenvector ${\\bf u}_2$. \n",
    "\n",
    "\n",
    "In practice, it is unlikely that a random vector ${\\bf x}_0$ will not have any component of ${\\bf u}_1$. In the chances that happens, finite operations during the iterative process will usually introduce such component."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Creating a matrix where the dominant eigenvalue is negative"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "n = 4\n",
    "    \n",
    "D = np.diag([-5,2,4,3])\n",
    "\n",
    "A = U@D@U.T\n",
    "\n",
    "eigl, eigv = la.eig(A)\n",
    "\n",
    "eig_index_sort = np.argsort(abs(eigl))\n",
    "eigpos = eig_index_sort[-1]\n",
    "\n",
    "print(eigl)\n",
    "print(eigv[:,eigpos])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "x = x0/la.norm(x0)\n",
    "\n",
    "for i in range(40):\n",
    "    x = A @ x\n",
    "    nrm = la.norm(x)\n",
    "    x = x/nrm\n",
    "    print(x)\n",
    "\n",
    "print('exact = ' ,eigv[:,eigpos])\n",
    "\n",
    "print('eig_approx = ', np.dot(x,A@x)/np.dot(x,x))\n",
    "\n",
    "print(D)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "What is happening here? Note that the scalar that multiplies the eigenvector ${\\bf u}_1$ in \n",
    "\n",
    "$$ {\\bf x}_k = (\\lambda_1)^k  \\alpha_1 {\\bf u}_1 + (\\lambda_1)^k  \\left(\\frac{\\lambda_2}{\\lambda_1}\\right)^k \\alpha_2 {\\bf u}_2 + (\\lambda_1)^k \\left[   \\left(\\frac{\\lambda_3}{\\lambda_1}\\right)^k \\alpha_3{\\bf u}_3 +  ...  \\right] $$\n",
    "\n",
    "is $(\\lambda_1)^k$, and hence if the eigenvalue  $\\lambda_1$ is negative, the solution of power iteration will converge to the eigenvector, but with alternating signs, i.e., ${\\bf u}_1$ and $-{\\bf u}_1$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### When dealing with dominant eigenvalues of multiplicity greater than 1: $|\\lambda_1| = |\\lambda_2| $ and $ \\lambda_1, \\lambda_2 > 0 $\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "n = 4\n",
    "\n",
    "D = np.diag([5,5,2,1])\n",
    "\n",
    "A = U@D@U.T\n",
    "\n",
    "eigl, eigv = la.eig(A)\n",
    "\n",
    "print(eigl)\n",
    "print(eigv[:,2])\n",
    "print(eigv[:,3])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "x = x0/la.norm(x0)\n",
    "\n",
    "for i in range(40):\n",
    "    x = A @ x\n",
    "    nrm = la.norm(x)\n",
    "    x = x/nrm\n",
    "    print(x)\n",
    "\n",
    "print('u1_exact = ' ,eigv[:,2])\n",
    "print('u2_exact = ' ,eigv[:,3])\n",
    "\n",
    "print('eig_approx = ', np.dot(x,A@x)/np.dot(x,x))\n",
    "\n",
    "print(D)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In general, power method converges to:\n",
    "\n",
    "$$ {\\bf x}_k = (\\lambda_1)^k  \\alpha_1 {\\bf u}_1 + (\\lambda_1)^k  \\left(\\frac{\\lambda_2}{\\lambda_1}\\right)^k \\alpha_2 {\\bf u}_2 + (\\lambda_1)^k \\left[   \\left(\\frac{\\lambda_3}{\\lambda_1}\\right)^k \\alpha_3{\\bf u}_3 +  ...  \\right] $$\n",
    "\n",
    "However if $|\\lambda_1| = |\\lambda_2| $ and $ \\lambda_1, \\lambda_2 > 0 $, we get:\n",
    "\n",
    "$$ {\\bf x}_k = (\\lambda_1)^k \\left( \\alpha_1 {\\bf u}_1 + \\alpha_2 {\\bf u}_2 \\right) + \\left[ ...  \\right] $$\n",
    "\n",
    "and hence the solution of power iteration will converge to a multiple of the linear combination of the eigenvectors ${\\bf u}_1$ and ${\\bf u}_2$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### When dealing with dominant eigenvalues of multiplicity greater than 1: $|\\lambda_1| = |\\lambda_2| $ and $ \\lambda_1, \\lambda_2 < 0 $\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "n = 4\n",
    "\n",
    "D = np.diag([-5,-5,2,1])\n",
    "\n",
    "A = U@D@U.T\n",
    "\n",
    "eigl, eigv = la.eig(A)\n",
    "\n",
    "print(eigl)\n",
    "print(eigv[:,2])\n",
    "print(eigv[:,3])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "x = x0/la.norm(x0)\n",
    "\n",
    "for i in range(40):\n",
    "    x = A @ x\n",
    "    nrm = la.norm(x)\n",
    "    x = x/nrm\n",
    "    print(x)\n",
    "\n",
    "print('u1_exact = ' ,eigv[:,2])\n",
    "print('u2_exact = ' ,eigv[:,3])\n",
    "\n",
    "print('eig_approx = ', np.dot(x,A@x)/np.dot(x,x))\n",
    "\n",
    "print(D)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In general, power method converges to:\n",
    "\n",
    "$$ {\\bf x}_k = (\\lambda_1)^k  \\alpha_1 {\\bf u}_1 + (\\lambda_1)^k  \\left(\\frac{\\lambda_2}{\\lambda_1}\\right)^k \\alpha_2 {\\bf u}_2 + (\\lambda_1)^k \\left[   \\left(\\frac{\\lambda_3}{\\lambda_1}\\right)^k \\alpha_3{\\bf u}_3 +  ...  \\right] $$\n",
    "\n",
    "However if $|\\lambda_1| = |\\lambda_2| $ and $ \\lambda_1, \\lambda_2 < 0 $, we get:\n",
    "\n",
    "$$ {\\bf x}_k = \\pm |\\lambda_1|^k \\left( \\alpha_1 {\\bf u}_1 + \\alpha_2 {\\bf u}_2 \\right) + \\left[ ...  \\right] $$\n",
    "\n",
    "and hence the solution of power iteration will converge to a multiple of the linear combination of the eigenvectors ${\\bf u}_1$ and ${\\bf u}_2$, but the signs will flip at each step of the iterative method."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### When dealing with dominant eigenvalues of multiplicity greater than 1: $|\\lambda_1| = |\\lambda_2| $ and $ \\lambda_1 , \\lambda_2 $ have opposite signs\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "n = 4\n",
    "\n",
    "D = np.diag([-5,5,2,1])\n",
    "\n",
    "A = U@D@U.T\n",
    "\n",
    "eigl, eigv = la.eig(A)\n",
    "\n",
    "print(eigl)\n",
    "print(eigv[:,0])\n",
    "print(eigv[:,1])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "x = x0/la.norm(x0)\n",
    "\n",
    "for i in range(40):\n",
    "    x = A @ x\n",
    "    nrm = la.norm(x)\n",
    "    x = x/nrm\n",
    "    print(x)\n",
    "\n",
    "print('u1_exact = ' ,eigv[:,2])\n",
    "print('u2_exact = ' ,eigv[:,3])\n",
    "\n",
    "print('eig_approx = ', np.dot(x,A@x)/np.dot(x,x))\n",
    "\n",
    "print(D)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In general, power method converges to:\n",
    "\n",
    "$$ {\\bf x}_k = (\\lambda_1)^k  \\alpha_1 {\\bf u}_1 + (\\lambda_1)^k  \\left(\\frac{\\lambda_2}{\\lambda_1}\\right)^k \\alpha_2 {\\bf u}_2 + (\\lambda_1)^k \\left[   \\left(\\frac{\\lambda_3}{\\lambda_1}\\right)^k \\alpha_3{\\bf u}_3 +  ...  \\right] $$\n",
    "\n",
    "However if $|\\lambda_1| = |\\lambda_2| $, $ \\lambda_1, \\lambda_2$ have opposite signs, we get:\n",
    "\n",
    "$$ {\\bf x}_k = \\pm |\\lambda_1|^k \\left( \\alpha_1 {\\bf u}_1 \\pm \\alpha_2 {\\bf u}_2 \\right) + \\left[ ...  \\right] $$\n",
    "\n",
    "and hence power iteration does not converge to one solution. Indeed, the method oscilates between two linear combination of eigenvectors, and fails to give the correct eigenvalue. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Summary - Pitfalls of power iteration:\n",
    "\n",
    "- Risk of eventual overflow. Use normalized power iteration to avoid this.\n",
    "\n",
    "\n",
    "- If the initial guess has $\\alpha_1 = 0$, the method will converge to multiple of eigenvector ${\\bf u}_2$ if infinite precision computation is used. In practice (in finite precision computations), this will not be an issue, and the method will converge to multiple of eigenvector ${\\bf u}_1$.\n",
    "\n",
    "\n",
    "- If the two largest eigenvalues are equal in magnitude, power iteration will converge to a vector that is a linear combination of the corresponding eigenvectors (or fail to converge). This is a real problem that cannot be discounted in practice. Other methods should be used in this case.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Estimating the eigenvalue"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We want to approximate the eigenvalue ${\\bf u}_1$ using the solution of power iteration\n",
    "\n",
    "$$ {\\bf x}_k = (\\lambda_1)^k  \\alpha_1 {\\bf u}_1 + (\\lambda_1)^k  \\left(\\frac{\\lambda_2}{\\lambda_1}\\right)^k \\alpha_2 {\\bf u}_2 + (\\lambda_1)^k \\left[   \\left(\\frac{\\lambda_3}{\\lambda_1}\\right)^k \\alpha_3{\\bf u}_3 +  ...  \\right] $$\n",
    "\n",
    "\n",
    "$ {\\bf x}_k $ approaches a multiple of the eigenvector ${\\bf u}_1$ as $k \\rightarrow \\infty$, hence\n",
    "\n",
    "$$ {\\bf x}_k  =   (\\lambda_1)^k  \\alpha_1 {\\bf u}_1 $$\n",
    "\n",
    "but also  \n",
    "\n",
    "$$ {\\bf x}_{k+1}  =   (\\lambda_1)^{k+1}  \\alpha_1 {\\bf u}_1 \\Longrightarrow {\\bf x}_{k+1} = \\lambda_1 {\\bf x}_{k} $$\n",
    "\n",
    "We can then approximate $\\lambda_1$ as the ratio of corresponding entries of the vectors ${\\bf x}_{k+1}$ and ${\\bf x}_{k}$, i.e., \n",
    "\n",
    "$$ \\lambda_1 \\approx \\frac{({\\bf x}_{k+1})_j } { ({\\bf x}_{k})_j }$$\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Error of Power Iteration"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We define the approximated eigenvector as \n",
    "\n",
    "$$ {\\bf u}_{approx} = \\frac{{\\bf x}_k } { (\\lambda_1)^k  \\alpha_1} $$\n",
    "\n",
    "and hence the error becomes the part of the power iteration solution that was neglected, i.e.,\n",
    "\n",
    "$$ {\\bf e} =  {\\bf u}_{approx} - {\\bf u}_1 = \\left(\\frac{\\lambda_2}{\\lambda_1}\\right)^k \\frac{\\alpha_2}{\\alpha_1} {\\bf u}_2 +  \\left[   \\left(\\frac{\\lambda_3}{\\lambda_1}\\right)^k \\frac{\\alpha_3}{\\alpha_1}{\\bf u}_3 +  ...  \\right]  $$\n",
    "\n",
    "and when $k$ is large, we can write (again, we are assuming that $|\\lambda_1| > |\\lambda_2|  \\ge |\\lambda_3|  \\ge |\\lambda_4| ... $ \n",
    "\n",
    "$${\\bf e}_k \\approx \\left(\\frac{\\lambda_2}{\\lambda_1}\\right)^k \\frac{\\alpha_2}{\\alpha_1} {\\bf u}_2 $$\n",
    "\n",
    "And when we take the norm of the error\n",
    "\n",
    "$$||{\\bf e}_k|| \\approx \\left|\\frac{\\lambda_2}{\\lambda_1}\\right|^k \\left|\\frac{\\alpha_2}{\\alpha_1}\\right| ||{\\bf u}_2 || \\rightarrow ||{\\bf e}_k|| = O\\left(\\left|\\frac{\\lambda_2}{\\lambda_1}\\right|^k \\right)$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Convergence of Power Iteration\n",
    "\n",
    "We want to see what happens to the error from one iteration of the other of power iteration\n",
    "\n",
    "$$ \\frac{||{\\bf e}_{k+1}||}{||{\\bf e}_{k}||} = \n",
    "\\frac{\\left|\\frac{\\lambda_2}{\\lambda_1}\\right|^{k+1} \\left|\\frac{\\alpha_2}{\\alpha_1}\\right|  }{\\left|\\frac{\\lambda_2}{\\lambda_1}\\right|^k \\left|\\frac{\\alpha_2}{\\alpha_1}\\right| } = \\frac{\\lambda_2}{\\lambda_1} $$ \n",
    "\n",
    "Or in other words, we can say that the error decreases by a **constant** value, given as $\\frac{\\lambda_2}{\\lambda_1} $, at each iteration.\n",
    "\n",
    "** Power method has LINEAR convergence! **\n",
    "\n",
    "$$ ||{\\bf e}_{k+1}|| = \\frac{\\lambda_2}{\\lambda_1} ||{\\bf e}_{k}||$$   \n",
    "or we can also write $$ ||{\\bf e}_{k+1}|| = \\left(\\frac{\\lambda_2}{\\lambda_1} \\right)^k||{\\bf e}_{0}||$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Simple Example:\n",
    "Suppose you are given a matrix with eigenvalues:\n",
    "\n",
    "$$[3,4,5]$$\n",
    "\n",
    "You use normalized power iteration to approximate one of the eigenvectors, which is given as ${\\bf x}$, and we assume $||{\\bf x} || = 1$.\n",
    "\n",
    "You knew the norm of the error of the initial guess was given as\n",
    "\n",
    "$$|| {\\bf e}_0 || = ||{\\bf x} - {\\bf x}_0 || = 0.3 $$\n",
    "\n",
    "How big will be the error after three rounds of power iteration? (Since all vectors have norm 1, the absolute and relative error are the same)\n",
    "\n",
    "\n",
    "$$|| {\\bf e}_3 || = \\left| \\frac{4}{5} \\right|^3 || {\\bf e}_0 || = 0.3 \\left| \\frac{4}{5} \\right|^3  = 0.1536 $$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Convergence plots"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "n=4\n",
    "\n",
    "lambda_array_ordered = [7, 3, -2, 1]\n",
    "\n",
    "X = np.random.rand(n,n)\n",
    "U,_ = sla.qr(X)\n",
    "D = np.diag(lambda_array_ordered)\n",
    "A = U@D@U.T\n",
    "eigl, eigv = la.eig(A)\n",
    "\n",
    "eig_index_sort = np.argsort(abs(eigl))\n",
    "eigpos = eig_index_sort[-1]\n",
    "u1_exact = eigv[:,eigpos]\n",
    "\n",
    "print('Largest lambda = ', lambda_array_ordered[0])\n",
    "print('Eigenvector = ', u1_exact)\n",
    "print('Convergence rate = ', lambda_array_ordered[1]/lambda_array_ordered[0])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Generate normalized initial guess\n",
    "x0 = np.random.random(n)\n",
    "x = x0/la.norm(x0)\n",
    "\n",
    "count = 0\n",
    "diff  = 1\n",
    "eigs  = [x[0]]\n",
    "error = [np.abs( eigs[-1]  - lambda_array_ordered[0] )]\n",
    "\n",
    "# We will use as stopping criteria the change in the\n",
    "# approximation for the eigenvalue\n",
    "\n",
    "while (diff > 1e-6 and count < 100):\n",
    "    count += 1\n",
    "    xnew = A@x #xk+1 = A xk\n",
    "    eigs.append(xnew[0]/x[0])\n",
    "    x = xnew/la.norm(xnew)    \n",
    "    diff  = np.abs( eigs[-1]  - eigs[-2] )\n",
    "    error.append( np.abs( eigs[-1]  - lambda_array_ordered[0] ) )    \n",
    "    print(\"% 10f % 2e % 2f\" %(eigs[-1], error[-1], error[-1]/error[-2])) "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.semilogy(np.abs(error)) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Inverse Power iteration\n",
    "\n",
    "What if we are interested in the smaller eigenvalue in magnitude?\n",
    "\n",
    "Suppose ${\\bf x},\\lambda$ is an eigenpair of ${\\bf A}$, such that ${\\bf A}{\\bf x}  = \\lambda {\\bf x}$. What would be an eigenvalue of  ${\\bf A}^{-1}$?\n",
    "\n",
    "\n",
    "$${\\bf A}^{-1}{\\bf A}{\\bf x}  = {\\bf A}^{-1}\\lambda {\\bf x}$$\n",
    "\n",
    "$${\\bf I}{\\bf x}  =  \\lambda {\\bf A}^{-1} {\\bf x}$$\n",
    "\n",
    "$$\\frac{1}{\\lambda}{\\bf x}  =  {\\bf A}^{-1} {\\bf x}$$\n",
    "\n",
    "\n",
    "** Hence $\\frac{1}{\\lambda}$ is an eigenvalue of ${\\bf A}^{-1} $ **.\n",
    "\n",
    "If we want to find the smallest eigenvalue in magnitude of ${\\bf A}$, we can perform power iteration using the matrix ${\\bf A}^{-1}$ to find $\\bar\\lambda = \\frac{1}{\\lambda}$, where  $\\bar\\lambda$ is the largest eigenvalue of ${\\bf A}^{-1}$.\n",
    "\n",
    "Let's implement that:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "n = 4\n",
    "D = np.diag([5,-1,2,7])\n",
    "\n",
    "A = U@D@U.T\n",
    "eigl, eigv = la.eig(A)\n",
    "\n",
    "eig_index_sort = np.argsort(abs(eigl))\n",
    "eig_index_sort\n",
    "eigpos = eig_index_sort[0]\n",
    "\n",
    "print(eigv[:,eigpos])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "x0 = np.random.random(n)\n",
    "nrm = la.norm(x0)\n",
    "x = x0/nrm\n",
    "\n",
    "for i in range(20):\n",
    "    x = la.inv(A)@x\n",
    "    x = x/la.norm(x)\n",
    "\n",
    "print(\"lambdan = \",x.T@A@x/(x.T@x))\n",
    "print(\"un = \", x) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Can you find ways to improve the code snippet above? "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "#Inverse Power iteration to get smallest eigenvalue\n",
    "x0 = np.random.random(n)\n",
    "nrm = la.norm(x0)\n",
    "x = x0/nrm\n",
    "P, L, Um = sla.lu(A)\n",
    "for k in range(20):\n",
    "    y = sla.solve_triangular(L, np.dot(P.T, x), lower=True)\n",
    "    x = sla.solve_triangular(Um, y)\n",
    "    x = x/la.norm(x)\n",
    "\n",
    "print(\"lambdan = \",x.T@A@x/(x.T@x))\n",
    "print(\"un = \", x)  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Inverse Shifted Power Iteration\n",
    "\n",
    "What if we want to find another eigenvalue that is not the largest or the smallest? \n",
    "\n",
    "Suppose ${\\bf x},\\lambda$ is an eigenpair of ${\\bf A}$, such that ${\\bf A}{\\bf x}  = \\lambda {\\bf x}$. We want to find the eigenvalues of the shifted inverse matrix $({\\bf A} - \\sigma{\\bf I})^{-1}$\n",
    "\n",
    "\n",
    "$$({\\bf A} - \\sigma{\\bf I})^{-1}{\\bf x}  = \\bar\\lambda {\\bf x}$$\n",
    "\n",
    "$${\\bf I}{\\bf x}  =  \\bar\\lambda ({\\bf A} - \\sigma{\\bf I}) {\\bf x} = \\bar\\lambda ({\\lambda \\bf I} - \\sigma{\\bf I}) {\\bf x}$$\n",
    "\n",
    "$$  \\bar\\lambda  = \\frac{1}{\\lambda-\\sigma}$$\n",
    "\n",
    "\n",
    "We could write the above eigenvalue problem as \n",
    "\n",
    "\n",
    "$$ {\\bf B}^{-1}{\\bf x}  = \\bar\\lambda {\\bf x}$$\n",
    "\n",
    "which can be solved by inverse power iteration, which will converge to the eigenvalue $\\frac{1}{\\lambda-\\sigma}$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "n = 4\n",
    "D = np.diag([5,-7,2,10])\n",
    "\n",
    "A = U@D@U.T\n",
    "eigl, eigv = la.eig(A)\n",
    "\n",
    "print(eigl)\n",
    "eigv"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "#Shifted Inverse Power iteration \n",
    "sigma = 1\n",
    "\n",
    "x0 = np.random.random(n)\n",
    "nrm = la.norm(x0)\n",
    "x = x0/nrm\n",
    "B = A - sigma*np.eye(n)\n",
    "P, L, Um = sla.lu(B)\n",
    "for k in range(20):\n",
    "    y = sla.solve_triangular(L, np.dot(P.T, x), lower=True)\n",
    "    x = sla.solve_triangular(Um, y)\n",
    "    x = x/la.norm(x)\n",
    "\n",
    "print(\"lambdan = \",x.T@A@x/(x.T@x))\n",
    "print(\"un = \", x)  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Computational cost and convergence\n",
    "\n",
    "- power iteration: to obtain largest eigenvalue in magnitude ${\\lambda_1}$\n",
    "    - Matrix-vector multiplications at each iteration: $O(n^2)$\n",
    "    - convergence rate: $\\left|\\frac{\\lambda_2}{\\lambda_1} \\right|$\n",
    "    \n",
    "   \n",
    "- inverse power iteration: to obtain smallest eigenvalue in magnitude ${\\lambda_n}$\n",
    "    - only one factorization: $O(n^3)$\n",
    "    - backward-forward substitutions to solve at each iteration: $O(n^2)$\n",
    "    - convergence rate: $\\left|\\frac{\\lambda_n}{\\lambda_{n-1}} \\right|$  \n",
    "    \n",
    "   \n",
    "- inverse shifted power iteration: to obtain an eigenvalue close to a known/given value $\\sigma$\n",
    "    - only one factorization: $O(n^3)$\n",
    "    - backward-forward substitutions to solve at each iteration: $O(n^2)$\n",
    "    - convergence rate: $\\left|\\frac{\\lambda_c - \\sigma}{\\lambda_{c2} - \\sigma} \\right|$  \n",
    "    where $\\lambda_c$ is the closest eigenvalue to $\\sigma$ and $\\lambda_{c2}$ is the second closest eigenvalue to $\\sigma$.\n",
    "   "
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
